AITopics | fast similarity search

Collaborating Authors

fast similarity search

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multiscale Quantization for Fast Similarity Search

Neural Information Processing SystemsNov-21-2025, 15:58:43 GMT

We propose a multiscale quantization approach for fast similarity search on large, high-dimensional datasets. The key insight of the approach is that quantization methods, in particular product quantization, perform poorly when there is large variance in the norms of the data points. This is a common scenario for real-world datasets, especially when doing product quantization of residuals obtained from coarse vector quantization. To address this issue, we propose a multiscale formulation where we learn a separate scalar quantizer of the residual norm scales. All parameters are learned jointly in a stochastic gradient descent framework to minimize the overall quantization error. We provide theoretical motivation for the proposed technique and conduct comprehensive experiments on two large-scale public datasets, demonstrating substantial improvements in recall over existing state-of-the-art methods.

fast similarity search, multiscale quantization, name change, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)

Add feedback

Fast Similarity Search via Optimal Sparse Lifting

Neural Information Processing SystemsNov-20-2025, 22:46:57 GMT

Similarity search is a fundamental problem in computing science with various applications and has attracted significant research attention, especially in large-scale search with high dimensions. Motivated by the evidence in biological science, our work develops a novel approach for similarity search. Fundamentally different from existing methods that typically reduce the dimension of the data to lessen the computational complexity and speed up the search, our approach projects the data into an even higher-dimensional space while ensuring the sparsity of the data in the output space, with the objective of further improving precision and speed. Specifically, our approach has two key steps. Firstly, it computes the optimal sparse lifting for given input samples and increases the dimension of the data while approximately preserving their pairwise similarity. Secondly, it seeks the optimal lifting operator that best maps input samples to the optimal sparse lifting. Computationally, both steps are modeled as optimization problems that can be efficiently and effectively solved by the Frank-Wolfe algorithm. Simple as it is, our approach has reported significantly improved results in empirical evaluations, and exhibited its high potentials in solving practical problems.

fast similarity search, name change, optimal sparse lifting, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.40)

Add feedback

HetFS: A Method for Fast Similarity Search with Ad-hoc Meta-paths on Heterogeneous Information Networks

Mao, Xuqi, Chen, Zhenyi, He, Zhenying, Jing, Yinan, Zhang, Kai, Wang, X. Sean

arXiv.org Artificial IntelligenceFeb-22-2025

Numerous real-world information networks form Heterogeneous Information Networks (HINs) with diverse objects and relations represented as nodes and edges in heterogeneous graphs. Similarity between nodes quantifies how closely two nodes resemble each other, mainly depending on the similarity of the nodes they are connected to, recursively. Users may be interested in only specific types of connections in the similarity definition, represented as meta-paths, i.e., a sequence of node and edge types. Existing Heterogeneous Graph Neural Network (HGNN)-based similarity search methods may accommodate meta-paths, but require retraining for different meta-paths. Conversely, existing path-based similarity search methods may switch flexibly between meta-paths but often suffer from lower accuracy, as they rely solely on path information. This paper proposes HetFS, a Fast Similarity method for ad-hoc queries with user-given meta-paths on Heterogeneous information networks. HetFS provides similarity results based on path information that satisfies the meta-path restriction, as well as node content. Extensive experiments demonstrate the effectiveness and efficiency of HetFS in addressing ad-hoc queries, outperforming state-of-the-art HGNNs and path-based approaches, and showing strong performance in downstream applications, including link prediction, node classification, and clustering.

ad-hoc meta-path, fast similarity search, heterogeneous information network, (1 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/s11280-024-01303-1

2502.16288

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Information Management > Search (0.97)

Add feedback

Angular Quantization-based Binary Codes for Fast Similarity Search

Neural Information Processing SystemsApr-6-2023, 12:43:29 GMT

This paper focuses on the problem of learning binary embeddings for efficient retrieval of high-dimensional non-negative data. Such data typically arises in a large number of vision and text applications where counts or frequencies are used as features. Also, cosine distance is commonly used as a measure of dissimilarity between such vectors. In this work, we introduce a novel spherical quantization scheme to generate binary embedding of such data and analyze its properties. The number of quantization landmarks in this scheme grows exponentially with data dimensionality resulting in low-distortion quantization.

angular quantization-based binary code, fast similarity search, landmark

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.45)

Add feedback

Angular Quantization-based Binary Codes for Fast Similarity Search

Gong, Yunchao, Kumar, Sanjiv, Verma, Vishal, Lazebnik, Svetlana

Neural Information Processing SystemsFeb-14-2020, 22:41:35 GMT

angular quantization-based binary code, fast similarity search, landmark

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

Add feedback

Multiscale Quantization for Fast Similarity Search

Wu, Xiang, Guo, Ruiqi, Suresh, Ananda Theertha, Kumar, Sanjiv, Holtmann-Rice, Daniel N., Simcha, David, Yu, Felix

Neural Information Processing SystemsFeb-14-2020, 17:56:50 GMT

We propose a multiscale quantization approach for fast similarity search on large, high-dimensional datasets. The key insight of the approach is that quantization methods, in particular product quantization, perform poorly when there is large variance in the norms of the data points. This is a common scenario for real- world datasets, especially when doing product quantization of residuals obtained from coarse vector quantization. To address this issue, we propose a multiscale formulation where we learn a separate scalar quantizer of the residual norm scales. All parameters are learned jointly in a stochastic gradient descent framework to minimize the overall quantization error.

fast similarity search, multiscale quantization, product quantization, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.67)

Add feedback